Parallel Text Corpora
Human Translated / Transcreated Parallel Text Corpora
The Mother Tongue Parallel Text Corpus of India Vol. I
English and 147 mother tongues of India | 5,332 sentences

The Mother Tongue Parallel Text Corpus of India Vol. I comprises English and 147 mother tongues of India. Each corpus comprises a total of 5,332 sentences, systematically structured around 152 grammatical categories.

The parallel corpus contains the following languages:

1. Assamese, 2. Bengali, 3. Bodo/Boro, 4. Dogri, 5. Gujarati, 6. Hindi, 7. Kannada, 8. Kashmiri, 9. Konkani, 10. Maithili, 11. Malayalam, 12. Manipuri, 13. Marathi, 14. Nepali, 15. Odia, 16. Punjabi, 17. Sanskrit, 18. Santhali, 19. Sindhi, 20. Tamil, 21. Telugu, 22. Urdu, 23. Anal, 24. Angami, 25. Apatani, 26. Are, 27. Awadhi, 28. Bagheli/BaghelKhandi, 29. Bagri, 30. Bagri Rajasthani, 31. Balti, 32. Bhadrawahi, 33. Bharmauri/Gaddi, 34. Bhojpuri, 35. BilaspuriKahluri, 36. Brajbhasha, 37. Bundeli/Bundelkhandi, 38. Chakru/Chokri, 39. Chambeali/Chamrali, 40. Chang, 41. Chhattisgarhi, 42. Chirr, 43. Chungli, 44. Churahi, 45. Coorgi/Kodagu, 46. Deori, 47. Dhundhari, 48. Dimasa, 49. Gangte, 50. Garhwali, 51. Garo, 52. Gujari, 53. Gujjari/Gujar/Gojri, 54. Halabi, 55. Handuri, 56. Hara/Harauti, 57. Haryanvi, 58. Hindi Multani, 59. Irula/IrularMozhi, 60. Kabui, 61. Kangri, 62. Kachchhi, 63. Karbi/Mikir, 64. Khandeshi, 65. KhariBoli, 66. Khasi, 67. Khezha, 68. Khiemnungan, 69. Khortha/Khotta, 70. Kisan, 71. Kodava, 72. Kokbarak, 73. Kolami, 74. Koli, 75. Kom, 76. Konda, 77. Konyak, 78. Koya, 79. Kudubi/Kudumbi, 80. Kuki, 81. KurmaliThar, 82. Ladakhi, 83. Lepcha, 84. Liangmei, 85. Limbu, 86. Lotha, 87. Lyngngam, 88. Magadhi/Magahi, 89. Malvi, 90. Mao, 91. Mara, 92. Maram, 93. Maring, 94. Mech/Mechhia, 95. Mewari, 96. Mewati, 97. Miri/Mishing, 98. Mishmi, 99. Mizo, 100. Mongsen, 101. Monpa, 102. Mundari, 103. Muwasi, 104. Nawait, 105. Nimadi, 106. Nissi, 107. Nocte, 108. Pahari, 109. Paite, 110. Palmuha, 111. Pania, 112. Paola, 113. Pawari/Powari, 114. Phom, 115. Pnar/Synteng, 116. Pochury, 117. Purkhi, 118. Rai, 119. Rajasthani, 120. Reang, 121. Rengma, 122. Rongmei, 123. Sadan/Sadri, 124. Sambalpuri, 125. Sangtam, 126. Saurashtra/Saurashtri, 127. Sema, 128. Shina, 129. Sirmauri, 130. Sugali, 131. Surjapuri, 132. Talgalo (galo), 133. Tangkhul, 134. Thado/Thadou, 135. Tibetan, 136. Tikhir, 137. Tripuri, 138. Tulu, 139. Vaiphei, 140. Wagdi, 141. Wancho, 142. Yimchungre, 143. Yerava, 144. Yerukala/Yerukula, 145. Zeliang, 146. Zemi, 147. Zou

The price indicated corresponds to a single language component. The total payment is determined by the number of language components requested.

For research citations, please use the following:

Dr. Narayan Kumar Choudhary, Writtik Bhattacharya, Dr. Saritha S.L., Dr. Amudha R., Dr. Sajila S., Dr. Satyaendra Kumar Awasthi, Dr. Rejitha K. S., Amom Nandaraj Meetei, Dr. Vijayalaxmi F Patil, Dr. Shahnawaz Alam, Yumnam Premila Chanu, Saurabh Varik, Dr. Mansoor Khan, Chetan Baji, Sonali Sutradhar, Umesh Chamling Rai, Bhageshree K Khandale, Dr. Zargar Adil Ahmad, Dr. Modugu Kasimbabu, Dr. Kamaraj S., Syeda Mustafiza Tamim, Dr. Shalinder Singh, Hemlata Daimary, Poulami Das, Shivangi Priya, Neha Dixit, Anand Jain, Abhishek Avtans, Akanksha Srivastava, Prangshu Manjul, Ankita Tiwari, Prof. Shailendra Mohan et al. 2025. The Mother Tongue Parallel Text Corpus of India Vol. I. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-08-8.

Rejitha K. S. and Narayan Kumar Choudhary (eds.). 2025. LDC-IL Corpus Insights. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-33-0...